Simple Interactive Statistical Analysis
Two by two tables
Input.
Fill in the values in the table. The values are supposed to be integers: whole, positive numbers without decimals.
Explanation.
Pearson Chi-square | Likelihood Ratio Chi-square | Yates Chi-square | Mantel-Haenszel Chi-square | Risk Ratio | Odds Ratio | Log Odds Ratio | Yule's Q | Yule's Y | Phi-square | Pearson correlation | Kappa | McNemar Test | Fisher Exact
Two by two tables provides you with various statistics and measures of association. On this page we try to explain the measures concerned; however, many of these measures are difficult to understand. Before applying these statistics, acquaint yourself very well with your data and the meaning of your variables, and build up a good understanding of what 'relationship' and 'independence' mean in the case of your data.
All four Chi-squares presented give probability values for the relationship between two dichotomous variables. They calculate the difference between the observed data and the data expected, given the marginals and the assumptions of the model of independence. The four Chi-squares give only an estimate of the true Chi-square and its associated probability value. This estimate might not be very good if the marginals are very uneven or if one of the cells has a small value (roughly less than five). In that case the Fisher Exact is a good alternative to the Chi-square. However, with a large number of cases the Chi-square is preferred, as the Fisher is difficult to calculate.
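As an illustration of 'observed versus expected', the sketch below computes the expected counts under independence and the resulting Pearson Chi-square for one table. The cell names a, b, c, d and the counts used are hypothetical, and the probability value uses the standard one degree of freedom relation rather than the algorithm this page itself uses.

```python
import math

def pearson_chi_square(a, b, c, d):
    """Pearson Chi-square and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)          # row marginals
    cols = (a + c, b + d)          # column marginals
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence: row total * column total / N
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper tail probability of a Chi-square with one degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts: 20 of 200 versus 25 of 200 with complications
chi2, p = pearson_chi_square(20, 180, 25, 175)
print(round(chi2, 3), round(p, 3))
```

The function assumes that no marginal total is zero; with an empty row or column the expected counts are zero and the statistic is undefined.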
Learn more about the Chi-square-test from Statistics at Square One.
Pearson's Goodness of Fit Chi-square (GFX) is the Chi-square most often used in research. Pearson's Chi-square is mathematically related to the classical Pearson's correlation coefficient and to Analysis of Variance.
Likelihood Ratio Chi-square (LRX) was developed more recently than the Pearson Chi-square and is the second most frequently used Chi-square. It is directly related to log-linear analysis and logistic regression. The LRX has the important property that an LRX with more than one degree of freedom can be partitioned over a number of smaller tables, each with its own (smaller) LRX and (smaller number of) degrees of freedom. The partial LRXs and partial degrees of freedom found in the smaller tables sum to the original LRX and the original number of degrees of freedom.
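A minimal sketch of the Likelihood Ratio Chi-square in its usual textbook form, twice the sum of observed times the natural logarithm of observed over expected. The function name and counts are hypothetical; empty cells are skipped because their contribution is taken to be zero.

```python
import math

def likelihood_ratio_chi_square(a, b, c, d):
    """Likelihood Ratio Chi-square for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # an empty cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2 * total

# Hypothetical counts: 20 of 200 versus 25 of 200 with complications
lrx = likelihood_ratio_chi_square(20, 180, 25, 175)
print(round(lrx, 3))
```

For tables that are not too sparse the result is usually close to the Pearson Chi-square for the same counts.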
Yates' Chi-square is equivalent to Pearson's Chi-square with continuity correction.
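The continuity correction can be sketched with the shortcut formula for two by two tables, in which N/2 is subtracted from the absolute cross-product difference before squaring. The function names and counts below are hypothetical.

```python
def pearson_shortcut(a, b, c, d):
    """Pearson Chi-square via the 2x2 shortcut N(ad - bc)^2 / (r1 r2 c1 c2)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom

def yates_chi_square(a, b, c, d):
    """Pearson Chi-square with Yates' continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # The corrected difference is not allowed to drop below zero
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / denom

# Hypothetical counts: 20 of 200 versus 25 of 200 with complications
uncorrected = pearson_shortcut(20, 180, 25, 175)
yates = yates_chi_square(20, 180, 25, 175)
print(round(uncorrected, 3), round(yates, 3))
```

The corrected value is never larger than the uncorrected one, which makes the test more conservative.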
Mantel-Haenszel Chi-square is thought to be closer to the 'true' Chi-square if small numbers of cases are involved. It is not often used. If you have doubts about your results, use Fisher Exact instead.
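For a single two by two table the Mantel-Haenszel Chi-square is commonly defined as the Pearson Chi-square multiplied by (N-1)/N. Whether this page uses exactly that form is an assumption; the sketch below, with hypothetical names and counts, only illustrates the common definition.

```python
def mantel_haenszel_chi_square(a, b, c, d):
    """Common single-table form: (N - 1)(ad - bc)^2 / (r1 r2 c1 c2)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return (n - 1) * (a * d - b * c) ** 2 / denom

def pearson_shortcut(a, b, c, d):
    """Pearson Chi-square via the shortcut N(ad - bc)^2 / (r1 r2 c1 c2)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom

# Hypothetical small table, where the (N - 1)/N factor matters most
mh_small = mantel_haenszel_chi_square(3, 7, 6, 4)
pearson_small = pearson_shortcut(3, 7, 6, 4)
print(round(mh_small, 3), round(pearson_small, 3))
```

With many cases the factor (N-1)/N approaches one and the two statistics become practically identical.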
Risk ratio. The risk ratio takes on values between zero ('0') and infinity. One ('1') is the neutral value and means that there is no difference between the groups compared; a value close to zero or infinity means a large difference between the two groups on the variable concerned. A risk ratio larger than one means that group one has a larger proportion than group two; if the opposite is true the risk ratio will be smaller than one. If you swap the two proportions, the risk ratio will take on its inverse (1/RR).
The risk ratio gives you the relative (percentage) difference in classification between group one and group two. For example, the proportion of people suffering from complications after traditional surgery equals 0.10 (10%), while the proportion suffering from complications after alternative surgery equals 0.125 (12.5%). The risk ratio equals 0.8 (0.1/0.125); 20% ((1-0.8)*100) fewer patients treated by the traditional method suffer from complications. Another example: 8% of freezers produced without quality control have paint scratches. This percentage is reduced to 5% if quality control is introduced. The risk ratio equals 1.6 (8/5); 60% more freezers are damaged if there is no quality control.
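The two examples can be reproduced with a few lines of arithmetic. The function and variable names in this sketch are hypothetical; the proportions are the ones quoted in the text.

```python
def risk_ratio(p_group_one, p_group_two):
    """Ratio of the proportion in group one to the proportion in group two."""
    return p_group_one / p_group_two

# Traditional (10%) versus alternative (12.5%) surgery
rr_surgery = risk_ratio(0.10, 0.125)
fewer_pct = (1 - rr_surgery) * 100      # percent fewer complications

# Freezers without (8%) versus with (5%) quality control
rr_freezers = risk_ratio(0.08, 0.05)
more_pct = (rr_freezers - 1) * 100      # percent more damaged freezers

print(round(rr_surgery, 3), round(fewer_pct, 1))
print(round(rr_freezers, 3), round(more_pct, 1))

# Swapping the two proportions gives the inverse, 1/RR
print(round(risk_ratio(0.125, 0.10), 3))
```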
Odds ratio. The odds ratio takes on values between zero ('0') and infinity. One ('1') is the neutral value and means that there is no difference between the groups compared; a value close to zero or infinity means a large difference. An odds ratio larger than one means that group one has a larger proportion than group two; if the opposite is true the odds ratio will be smaller than one. If you swap the two proportions, the odds ratio will take on its inverse (1/OR).
The odds ratio gives the ratio of the odds of suffering some fate. The odds themselves are also a ratio. To explain this we will take the example of traditional versus alternative surgery. If 10% of operations result in complications, then the odds of having complications if traditional surgery is used equal 0.11 (0.1/0.9; the chance of getting complications is 0.11 times the chance of not getting complications). 12.5% of the operations using the alternative method result in complications, giving odds of 0.143 (0.125/0.875). The odds ratio equals 0.778 (0.11/0.143). The odds of getting complications, rather than not getting them, are 0.778 times as high in traditional as in alternative surgery. The inverse of the odds ratio equals 1.286. The odds of getting complications are 1.286 times as high in alternative as in traditional surgery. This takes some getting used to, we admit, but it has its advantages.
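The surgery example works out as follows. The names are hypothetical, and the cell counts in the second half are invented so that they match the 10% and 12.5% proportions in the text.

```python
def odds(p):
    """Odds of an event with probability p: p against (1 - p)."""
    return p / (1 - p)

odds_traditional = odds(0.10)     # 0.1 / 0.9
odds_alternative = odds(0.125)    # 0.125 / 0.875
odds_ratio = odds_traditional / odds_alternative

print(round(odds_traditional, 3), round(odds_alternative, 3))
print(round(odds_ratio, 3), round(1 / odds_ratio, 3))

# The same odds ratio from cell counts, as the cross-product ratio ad / bc.
# Hypothetical counts: 20 of 200 (10%) and 25 of 200 (12.5%) with complications.
a, b, c, d = 20, 180, 25, 175
cross_product = (a * d) / (b * c)
print(round(cross_product, 3))
```

The cross-product form shows why turning the table can only produce the same value or its inverse: swapping rows or columns exchanges ad and bc.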
The odds ratio can be compared with the risk ratio. The risk ratio is easier to interpret than the odds ratio. However, in practice the odds ratio is used more often. This has to do with the fact that the odds ratio is more closely related to frequently used statistical techniques such as logistic regression. Also, the odds ratio has the attractive property that, however you turn the table, it will always take on the same value or the inverse (1/OR) of that value.
Log odds ratio. The log-odds, the natural logarithm of the odds ratio, does not have an easy-to-understand meaning. However, the log-odds is symmetric, running from minus infinity to plus infinity, with zero being the neutral value. This makes it easier to compare negative with positive associations.
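The symmetry can be seen by swapping the two rows of a table: the odds ratio becomes its inverse, while the log-odds only changes sign. A minimal sketch with hypothetical names and counts:

```python
import math

def log_odds_ratio(a, b, c, d):
    """Natural logarithm of the odds ratio ad / bc for [[a, b], [c, d]].

    Raises an error if b or c (or a or d) is zero, where the log-odds
    is infinite or undefined.
    """
    return math.log((a * d) / (b * c))

# Hypothetical counts, and the same table with the rows swapped
original = log_odds_ratio(20, 180, 25, 175)
swapped = log_odds_ratio(25, 175, 20, 180)
print(round(original, 3), round(swapped, 3))

# A table with no association gives the neutral value zero
print(log_odds_ratio(10, 20, 30, 60))
```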
Yule's Q is based on the odds ratio and is a symmetric measure taking on values between -1 and +1. A value of -1 or +1 implies perfect negative or positive association, 0 (zero) no association. In two by two tables Yule's Q is equal to Goodman and Kruskal's Gamma. The interpretation of Q as Gamma is easiest to understand. Each observation is compared with each other observation; each such comparison is called a pair. If one observation is higher in value than the other on both the horizontal and the vertical variable, the pair is called concordant; if it is higher on one variable and lower on the other, the pair is discordant. Gamma is the number of concordant pairs minus the number of discordant pairs, divided by the sum of the two. A high Gamma means that concordant pairs predominate: high values on the vertical variable tend to go with high values on the horizontal variable.
Yule's Y is based on the odds ratio and is a symmetric measure taking on values between -1 and +1. A value of -1 or +1 implies perfect negative or positive association, 0 (zero) no association. The measure tends to estimate associations more conservatively than Yule's Q. The measure has little substantive or theoretical meaning.
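Both of Yule's measures can be written in terms of the cross-products ad and bc, in their usual textbook definitions. The sketch below uses hypothetical names and counts and shows that Y lies closer to zero than Q for the same table.

```python
import math

def yules_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc), equal to (OR - 1) / (OR + 1)."""
    return (a * d - b * c) / (a * d + b * c)

def yules_y(a, b, c, d):
    """Yule's Y = (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc))."""
    root_ad = math.sqrt(a * d)
    root_bc = math.sqrt(b * c)
    return (root_ad - root_bc) / (root_ad + root_bc)

# Hypothetical counts: 20 of 200 versus 25 of 200 with complications
q = yules_q(20, 180, 25, 175)
y = yules_y(20, 180, 25, 175)
print(round(q, 3), round(y, 3))

# An empty off-diagonal cell gives perfect association on both measures
print(yules_q(30, 0, 10, 40), yules_y(30, 0, 10, 40))
```

In the two by two case ad counts the concordant pairs and bc the discordant pairs, which is why Q coincides with Gamma.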
Phi-square is simply the Pearson Chi-square divided by the number of cases. It takes on the value 0 (zero) if there is no association between the variables and the value 1 (one) if there is perfect association. The Phi-square is equal to the square of the Pearson correlation coefficient. This relationship with the correlation coefficient means that the Phi-square gives you the proportion of variance in one variable explained by the variance in the other variable. This is considered very meaningful by many. However, Reynolds correctly remarked that although explained variance might have mathematical meaning, it does not necessarily mean anything substantive or theoretical.
Pearson correlation coefficient. Square it to obtain the explained variance.
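The link between the Pearson Chi-square, the Phi-square and the Pearson correlation can be checked numerically. This sketch uses hypothetical names and counts and the usual two by two formula for the correlation, (ad - bc) divided by the square root of the product of the four marginals.

```python
import math

def pearson_r(a, b, c, d):
    """Pearson correlation between the two dichotomous variables."""
    marginals = (a + b) * (c + d) * (a + c) * (b + d)
    return (a * d - b * c) / math.sqrt(marginals)

def phi_square(a, b, c, d):
    """Pearson Chi-square divided by the number of cases."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = sum(
        (observed[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2)
        for j in range(2)
    )
    return chi2 / n

# Hypothetical counts: 20 of 200 versus 25 of 200 with complications
r = pearson_r(20, 180, 25, 175)
phi2 = phi_square(20, 180, 25, 175)
print(round(r, 4), round(r ** 2, 6), round(phi2, 6))
```

The sign of the correlation is lost in the Phi-square, so report the correlation itself when the direction of the association matters.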
Kappa measure of agreement. Kappa takes on the value zero if there is no more agreement between two judges or tests than can be expected on the basis of chance. Kappa takes on the value 1 if there is perfect agreement; all observations are then on the diagonal from the upper left to the bottom right, the diagonal of agreement. It is considered that Kappa values lower than 0.4 represent poor agreement, values between 0.4 and 0.75 fair to good agreement, and values higher than 0.75 excellent agreement. Negative Kappa indicates an application problem.
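A minimal sketch of Kappa in its usual form, observed agreement minus chance agreement, divided by one minus chance agreement. The names and the ratings of the two hypothetical judges are invented for illustration.

```python
def kappa(a, b, c, d):
    """Cohen's Kappa for the agreement table [[a, b], [c, d]].

    a and d are the cells on the diagonal of agreement.
    """
    n = a + b + c + d
    observed_agreement = (a + d) / n
    # Agreement expected by chance, from the products of the marginals
    chance_agreement = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)

# Hypothetical ratings: the judges agree on 85 of 100 cases
k = kappa(40, 10, 5, 45)
print(round(k, 3))

# All observations on the diagonal of agreement
print(kappa(50, 0, 0, 50))
```

Kappa is undefined when chance agreement equals one, which happens when both judges put every case in the same category.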
McNemar Change Test. This test studies the change in a group of respondents measured twice on a dichotomous variable. It is customary in that case to tabulate the data in a two by two table. If as many respondents changed from A to B as changed from B to A, then the numbers of respondents in the bottom left and top right cells, the diagonal of changers, would be equal. If the numbers in the two cells are not equal, this indicates a certain direction in the change observed. The McNemar test indicates how likely it is that the observed direction of change arose by chance. For a slightly more thorough discussion of this test consult the pairwise help page.
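The test only looks at the two cells on the diagonal of changers. The sketch below uses the common Chi-square form without continuity correction, (b - c) squared over (b + c); this is an assumption about the variant, and the names and counts are hypothetical.

```python
import math

def mcnemar(b, c):
    """McNemar Chi-square (1 df, no continuity correction) and p-value.

    b is the top right cell and c the bottom left cell, the two groups
    of respondents who changed in opposite directions.
    """
    if b + c == 0:
        return 0.0, 1.0  # nobody changed, so there is no direction to test
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail probability of a Chi-square with one degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical data: 15 respondents changed from A to B, 5 from B to A
chi2_change, p_change = mcnemar(15, 5)
print(round(chi2_change, 3), round(p_change, 3))

# Equal numbers of changers in both directions
print(mcnemar(10, 10))
```

With few changers an exact binomial test on b and c is the usual alternative to this Chi-square approximation.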
Fisher Exact. Gives various exact probabilities for this table. An extensive description of Fisher Exact analysis is provided on the Fisher help page.
Technical Discussion.
The algorithm to calculate the significance of the Chi-square comes from Poole et al; the algorithm is also mentioned in the 'Epi-Info' manual (1994).